3 Algorithms for Binary Neural Networks

3.1 Overview
Binarization is the most extreme form of quantization and is the focus of this book. Under binarization, a 1-bit quantization, data can take only one of two values: −1 (or 0) or +1. Both weights and activations can therefore be represented by a single bit, compressing the network without consuming much memory. In addition, binarization replaces costly matrix multiplication with lightweight bitwise XNOR and Bitcount operations. Compared with alternative compression techniques, binary neural networks (BNNs) thus offer a variety of hardware-friendly advantages, including significant acceleration, memory savings, and power efficiency. The usefulness of binarization has been demonstrated by ground-breaking work such as BNN [99] and XNOR-Net [199], with XNOR-Net achieving up to a 58× speedup on CPUs and 32× memory savings for a 1-bit convolution layer. Following the BNN paradigm, a great deal of research has been carried out on this topic in recent years in computer vision and machine learning [84, 201, 153], and binarization has been applied to a variety of everyday tasks, including image classification [48, 199, 159, 196, 267, 259], detection [263, 240, 264, 260], point cloud processing [194, 261], object re-identification [262], etc.
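To make the arithmetic concrete, the sketch below (our own illustration rather than code from any of the cited works; the function names and the NumPy-based encoding are assumptions) shows how a dot product between binarized vectors reduces to XNOR and Bitcount.

```python
import numpy as np

def binarize(x):
    """Map real values to {-1, +1} with the sign function."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def xnor_popcount_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors using only XNOR and Bitcount.

    Encoding -1 -> 0 and +1 -> 1, XNOR counts the matching positions p;
    the {-1, +1} dot product is then p - (n - p) = 2*p - n.
    """
    a = a_bits > 0                         # boolean encoding of +1
    b = b_bits > 0
    matches = np.count_nonzero(~(a ^ b))   # XNOR, then Bitcount
    n = a_bits.size
    return 2 * matches - n

rng = np.random.default_rng(0)
w, x = rng.standard_normal(64), rng.standard_normal(64)
wb, xb = binarize(w), binarize(x)
# Same result as the floating-point dot product of the binarized vectors,
# but obtained with bitwise operations only.
assert xnor_popcount_dot(wb, xb) == int(np.dot(wb, xb))
```

In practical BNN kernels the ±1 values are packed into machine words, so a single XNOR plus popcount instruction processes 32 or 64 positions at once, which is where the reported speedups come from.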
By converting a layer from full precision to 1 bit, the binarization approach also offers an intuitive way to verify the significance of that layer: if performance suffers noticeably after binarizing a particular layer, we can infer that the layer lies on the network's sensitive path. From the perspective of explainable machine learning, it is also essential to determine whether full-precision and binarized models operate similarly.
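As a hedged illustration of this per-layer sensitivity probe (our own sketch, not a procedure prescribed by the cited works), one can binarize one convolution at a time and measure the resulting accuracy drop. The scaled-sign weight binarization is in the spirit of XNOR-Net-style binarization, and the `evaluate` helper is assumed to be supplied by the reader's setup.

```python
import copy
import torch
import torch.nn as nn

def binarize_layer_weights(conv: nn.Conv2d):
    """Replace a convolution's weights by sign(w) scaled by the mean |w| (in place)."""
    with torch.no_grad():
        scale = conv.weight.abs().mean()
        conv.weight.copy_(torch.sign(conv.weight) * scale)

def layer_sensitivity(model: nn.Module, evaluate):
    """Binarize one conv layer at a time and report the accuracy drop per layer.

    `evaluate(model) -> float` is assumed to exist in the caller's setup.
    A large drop for a given layer suggests it lies on the network's sensitive path.
    """
    baseline = evaluate(model)
    report = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            probe = copy.deepcopy(model)               # leave the original model untouched
            binarize_layer_weights(dict(probe.named_modules())[name])
            report[name] = baseline - evaluate(probe)  # accuracy drop for this layer
    return report
```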
Beyond devising methods of model binarization, numerous researchers have sought to shed light on the behavior of binarized models and on the relationship between model robustness and deep network architecture. Such studies may help answer fundamental questions about which network topologies are preferable and how deep networks actually function. Thoroughly exploring BNN studies is therefore worthwhile, since they help us better understand the behaviors and architectures of effective and reliable deep learning models. Some outstanding prior art reveals how the components of a BNN work. For example, Bi-Real Net [159] incorporates additional shortcuts (Bi-Real) to mitigate the information loss caused by binarization. This structure functions much like the ResNet shortcut [84], which helps explain why commonly used shortcuts can somewhat improve the performance of deep neural networks. Inspecting the activations shows that more fine-grained information from shallow layers can be transmitted to deeper layers during forward propagation; conversely, gradients can be propagated directly backward through the shortcut, avoiding the vanishing-gradient problem.
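The following sketch (our own simplified illustration, not the official Bi-Real Net code) shows the idea in PyTorch: the real-valued input bypasses the binarized convolution through an identity shortcut, so shallow-layer information and gradients flow around the 1-bit operation. The straight-through estimator and all class names here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign in the forward pass; straight-through estimator in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients through only where |x| <= 1 (clipped identity).
        return grad_out * (x.abs() <= 1).float()

class BiRealStyleBlock(nn.Module):
    """Binary convolution with a real-valued identity shortcut (Bi-Real-style sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = BinarizeSTE.apply(x)   # binarize activations
        out = self.conv(out)         # weights would also be binarized in a full BNN
        out = self.bn(out)
        return out + x               # real-valued shortcut preserves information

x = torch.randn(2, 16, 8, 8, requires_grad=True)
y = BiRealStyleBlock(16)(x).sum()
y.backward()                         # gradients also flow through the shortcut
```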
By building numerous groups of weak classifiers, some ensemble approaches [301] improve BNN performance but occasionally run into overfitting issues. Based on analysis and experiments with BNNs, they demonstrated that the number of neurons trumps bit width